This report presents the computational results for the instances. All instances are minimized, i.e. if an objective function \(z(x)\) is maximized, we minimize \(-z(x)\) instead. Note that all instances contain only binary variables.

Instances

We consider a total of 2017 instances named

Forget20_[problemClass]_[n]_[p]_[rangeOfCosts]_[costGenerationMethod]_[constraintId]_[id].raw

where

The cost generation methods used for generating the coefficients of the objective functions are (given in column coef):

For further details, see the documentation of the function genSample in the R package gMOIP.

More detailed results for each instance can be seen in the files named [instanceName]_results.html (see for instance this example).

Input and output statistics

A set of statistics is stored for each instance and algorithm run. The statistics are stored in a JSON result file (v1.0) for each instance and algorithm configuration, named [Instance name]_[Algorithm config]_result.json. Most statistics are aggregated in the file stat.csv.

Input statistics

For each instance we have the following statistics:


  • instance: Instance name
  • pb: Problem class
    • KP = Knapsack Problem
    • AP = Assignment Problem
    • UFLP = Uncapacitated Facility Location Problem
  • n: Number of variables.
  • p: Number of objectives.
  • coef: The method used for the generation of the objective coefficients (see section Research questions). We have:
    • random = random coefficient generation
    • 2box = two-boxes generation
    • spheredown = generation on the lower part of a sphere (in minimization)
    • sphereup = generation on the upper part of a sphere (in minimization)
  • rangeC: the range within which the objective coefficients are generated. For the facility location problems, two sets of costs are generated: the assignment costs (\(c_{ij}\)) and the facility opening costs (\(f_j\)). It is reasonable to assume that these two sets of costs may not take their values in the same range. In this case rangeC defines two ranges: the assignment costs lie in \([a,b]\), and the facility opening costs fall into two categories of instances:
    • facility location easy: \(f_j \in [a+b, 2b]\). It is called “easy” because high facility opening costs seem to make the problem easier.
    • facility location hard: \(f_j \in [0.1a, 0.1b]\). It is called “hard” because small facility opening costs seem to make the problem harder.
  • rangeGapC: Gap of the objective coefficients.
  • ratioNDcoef: Proportion of objective coefficient vectors that are non-dominated with respect to the other objective coefficient vectors. Here each variable has one coefficient vector in \(\mathbb{R}^p\), containing its coefficient in each of the \(p\) objectives. Examples of the four ways of generating the objective coefficients are given below:
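
For example, ratioNDcoef can be computed with a pairwise dominance check over the coefficient vectors. The sketch below is ours (not the gMOIP implementation); it assumes minimization and uses made-up random integer coefficients:

```python
import numpy as np

rng = np.random.default_rng(42)

def ratio_nd_coef(C):
    """Proportion of rows of C (one coefficient vector in R^p per
    variable) that are non-dominated w.r.t. the other rows (minimization)."""
    n = C.shape[0]
    nd = 0
    for i in range(n):
        # Row j dominates row i if it is <= in every component and
        # strictly < in at least one.
        dominated = any(
            np.all(C[j] <= C[i]) and np.any(C[j] < C[i])
            for j in range(n) if j != i
        )
        nd += not dominated
    return nd / n

# Random coefficients for n = 20 variables, p = 3 objectives in [1, 10].
C = rng.integers(1, 11, size=(20, 3))
print(ratio_nd_coef(C))
```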

Random

The coefficients are generated randomly with a uniform distribution in the range \([a,b]\). For three objectives, randomly generated coefficients look as follows.

[Interactive 3D plot of the generated coefficients (requires JavaScript).]

Sphere down

The coefficients are generated on the lower part of a sphere (see next picture). Note that the sphere is adjusted such that the coefficients are in the range \([a,b]\), i.e. the sphere is not necessarily included in \([a,b]^p\).

[Interactive 3D plot of the generated coefficients (requires JavaScript).]
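
A possible way to implement this kind of generation can be sketched as follows (our own reconstruction, not necessarily the genSample implementation): sample unit directions, keep the downward ones, and place the sphere so its lower part spans \([a,b]\) in each coordinate.

```python
import numpy as np

rng = np.random.default_rng(7)

def sphere_down(n, p, a, b):
    """Generate n coefficient vectors in R^p on the lower part of a
    sphere, placed so that every coordinate falls in [a, b]."""
    pts = []
    while len(pts) < n:
        d = rng.normal(size=p)
        d /= np.linalg.norm(d)
        if np.all(d <= 0):        # keep only "downward" directions
            pts.append(d)
    pts = np.array(pts)
    # Sphere centered at (b, ..., b) with radius b - a: its lower part
    # spans exactly [a, b] in each coordinate, while the full sphere
    # is not included in [a, b]^p.
    return b + (b - a) * pts
```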

Sphere up

The coefficients are generated on the upper part of a sphere (see next picture). Note that the sphere is adjusted such that the coefficients are in the range \([a,b]\), i.e. the sphere is not necessarily included in \([a,b]^p\).

[Interactive 3D plot of the generated coefficients (requires JavaScript).]

2box

The coefficients are generated randomly but in two specific parts of \([a,b]^p\) (see next picture).

[Interactive 3D plot of the generated coefficients (requires JavaScript).]

Algorithm configuration

The branch and bound algorithm always uses the linear relaxation as lower bound set and the incumbent set as upper bound set. At each node, the extreme points of the lower bound set are checked for integrality and possibly added to the upper bound set. The different configurations are based on:

  • nodesel: node selection strategy.
  • varsel: variable selection strategy.
  • OB: objective branching strategy.

Combinations of nodesel, varsel and OB give the configuration of the algorithm:

  • Node selection (breadth, depth): in the multi-objective branch and bound literature, depth-first strategies are (almost) always used and are considered better than breadth-first. However, if a problem with an “easy” single-objective version is solved, then many non-dominated points may be found in nodes close to the root, in the early stages of the tree. Hence, a breadth-first strategy may provide a better upper bound set earlier in the algorithm, as a larger number of points is expected to be found shallow in the tree, while only a few points can be found at each node deep in the tree.

  • Variable selection (mof, mfavg): to the best of our knowledge, no extensive study of variable selection has been conducted in the literature. Sometimes this component is not even mentioned. As it is not the main purpose of this study, only two rules that rely on the characteristics of the lower bound set are tested here.

    • mof: Most Often Fractional. Branching is performed on the variable that is most often fractional among the extreme points of the lower bound set.

    • mfavg: Most Fractional on AVeraGe. Branching is performed on the variable whose average value among the extreme points of the lower bound set is the most fractional, i.e. the closest to 0.5.

  • Branching scheme (none, exact, cone): in a regular branch and bound, branching is operated (i.e. sub-problems are created) in the decision space. In the bi-objective case, it has been shown that creating additional sub-problems in the objective space (a procedure called objective branching in this study) leads to better computational times. Three versions of the branch and bound are tested:

    • none: no objective branching is performed. Hence, this version is just a regular branch and bound.

    • exact: objective branching is performed using the algorithm from [master thesis paper].

    • unique cone: a unique sub-problem is created at each node. In this case, objective branching is applied on the nadir point of the local upper bounds dominated by the lower bound set. See [master thesis paper] for more details.
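
The two variable selection rules above can be illustrated on a small matrix of extreme points of a lower bound set (a minimal sketch; the function names and data are ours):

```python
import numpy as np

def mof(X, tol=1e-6):
    # Most Often Fractional: for each variable (column), count in how
    # many extreme points its value is fractional; branch on the max.
    frac = np.abs(X - np.round(X)) > tol
    return int(np.argmax(frac.sum(axis=0)))

def mfavg(X):
    # Most Fractional on AVeraGe: branch on the variable whose average
    # value over the extreme points is closest to 0.5.
    return int(np.argmin(np.abs(X.mean(axis=0) - 0.5)))

# Rows = extreme points of the lower bound set, columns = variables.
X = np.array([[0.9, 0.5, 0.0],
              [0.9, 0.5, 1.0],
              [0.9, 1.0, 0.6]])
print(mof(X))    # variable 0 is fractional in all three points
print(mfavg(X))  # variable 2 has mean 0.53, closest to 0.5
```

Note that the two rules can pick different variables, as in this example.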

Output statistics

For each algorithm run we have the following statistics:

  • solved: 1 if the instance is solved within 3600 sec, 0 otherwise.
  • YN: size of \(\mathcal{Y}_N\). If solved = 0, it represents the size of the upper bound set at 3600 sec, when the algorithm stops.
  • YNse: Number of supported extreme nondominated points.
  • YNs: Number of supported nondominated points.
  • YNsRatio: Ratio of supported nondominated points.
  • YNusRatio: Ratio of unsupported nondominated points.
  • YNsneRatio: Ratio of supported non-extreme nondominated points.
  • nbnodes: number of nodes explored.
  • mindepthT: minimal depth of a leaf node.
  • maxdepthT: maximal depth of a leaf node (and thus of the tree).
  • avgdepthT: average depth of the leaf nodes.
  • avgdepthYN: average depth of the nodes where the non-dominated points were found.
  • nbleaf: number of leaf nodes.
  • nbinfeas: number of nodes pruned by infeasibility.
  • pctinfeas: proportion (in %) of leaf nodes pruned by infeasibility.
  • tpsinfeas: average time spent pruning a node by infeasibility (in msec).
  • nbopt: number of nodes pruned by optimality.
  • pctopt: proportion (in %) of leaf nodes pruned by optimality.
  • tpsopt: average time spent pruning a node by optimality (in msec).
  • nbdomi: number of nodes pruned by dominance.
  • pctdomi: proportion (in %) of leaf nodes pruned by dominance.
  • avgdomi: average time spent pruning a node by dominance (in msec).
  • nbLB: number of lower bound sets computed.
  • avgfacets: average number of facets in the lower bound set (i.e. in \(\mathcal{L} + \mathbb{R}^p\)).
  • avgNDf: average number of strictly non-dominated facets.
  • pctavgNDf: proportion (in %) of facets that are strictly non-dominated.
  • avgWNDf: average number of weakly non-dominated facets.
  • pctavgWNDf: proportion (in %) of facets that are weakly non-dominated.
  • maxfacets: maximal number of facets a lower bound set had in the tree.
  • maxNDf: number of strictly non-dominated facets in the lower bound set with the maximal number of facets.
  • pctmaxNDf: proportion (in %) of facets that are strictly non-dominated in the lower bound set with the maximal number of facets.
  • maxWNDf: number of weakly non-dominated facets in the lower bound set with the maximal number of facets.
  • pctmaxWNDf: proportion (in %) of facets that are weakly non-dominated in the lower bound set with the maximal number of facets.
  • tpstotal: CPU time (in sec) used to solve the instance. 3600 if the instance is not solved.
  • tpsLB: CPU time (in sec) used to compute lower bound sets.
  • pcttpsLB: proportion (in %) of the total CPU time spent in the computation of lower bound sets.
  • tpsdomi: CPU time (in sec) used for the dominance tests, i.e. when the algorithm has to determine whether a node can be pruned by dominance or not.
  • pcttpsdomi: proportion (in %) of the total CPU time spent in the dominance tests.
  • tpsUB: CPU time (in sec) used to update the upper bound set.
  • pcttpsUB: proportion (in %) of the total CPU time spent in updating the upper bound set.
  • tpsnodesel: CPU time (in sec) used to choose the next node to develop.
  • pcttpsnodesel: proportion (in %) of the total CPU time spent in choosing the next node to develop.
  • tpsvarsel: CPU time (in sec) used to choose the variable to branch on.
  • pcttpsvarsel: proportion (in %) of the total CPU time spent in choosing the variable to branch on.
  • tpsOB: CPU time (in sec) used to create the sub-problems in the objective space, i.e. to compute objective branching. Note that this requires two steps: computing the SLUBs, and performing additional dominance tests to determine the dominance status of each local upper bound. This number takes both steps into account.
  • pcttpsOB: proportion (in %) of the total CPU time spent in computing objective branching.
  • tpsSLUB: CPU time (in sec) used to compute the super local upper bounds.
  • pcttpsSLUB: proportion (in %) of the total CPU time spent in computing the super local upper bounds.
  • tpsdomiLUB: CPU time (in sec) used to do the additional dominance tests that determine the dominance status of each local upper bound.
  • pcttpsdomiLUB: proportion (in %) of the total CPU time spent in doing the additional dominance tests on the local upper bounds.
  • nbOB: number of nodes where two or more sub-problems are created in the objective space. When using the exact objective branching (OB = exact), it in particular shows how often it is actually possible to split the objective space with the definition of the sub-problems that we used (\(z(x) \leqq \bar{z}\), \(\bar{z} \in \mathbb{R}^p\)).
  • pctnbOB: proportion (in %) of the explored nodes that were split in two or more sub-problems in the objective space.
  • avgdepthOB: [relevant only if OB = exact] average depth of the nodes split in two or more sub-problems in the objective space.
  • mindepthOB: [relevant only if OB = exact] minimal depth of the nodes split in two or more sub-problems in the objective space.
  • maxdepthOB: [relevant only if OB = exact] maximal depth of the nodes split in two or more sub-problems in the objective space.
  • avgnbpbOB: [relevant only if OB = exact] average number of sub-problems created in the objective space in the nodes split in two or more sub-problems in the objective space.
  • maxnbpbOB: [relevant only if OB = exact] maximal number of sub-problems created in the objective space in the nodes split in two or more sub-problems in the objective space.

For each instance we also store the non-dominated set \(\mathcal{Y}_N\) found by the algorithm and, if an exact solution was found, the efficient set \(\mathcal{X}_E\). Statistics on how the nondominated points are found are stored in yNStat. The statistics are the following:

  • the \(p\) first columns correspond to the values of the objective functions.
  • node: number of the node where this point was discovered. The higher this number, the later the point was discovered.
  • time: time (in sec) elapsed between the start of the algorithm and when this point has been found (for the first time).
  • depth: depth of the node where this point has been found (for the first time).

yNStat is sorted in exactly the same order as \(\mathcal{X}_E\), i.e. each row represents a non-dominated point and its corresponding solution.
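
For example, output statistics such as avgdepthYN can be recomputed from yNStat (a sketch with made-up rows; we assume here \(p = 2\) and the column order described above):

```python
# Hypothetical yNStat rows for p = 2 objectives:
# (z1, z2, node, time, depth)
ynstat = [
    (4, 12, 3, 0.1, 2),
    (7, 8, 15, 0.8, 5),
    (10, 3, 4, 0.2, 3),
]

depths = [row[4] for row in ynstat]
avgdepth_yn = sum(depths) / len(depths)     # matches the avgdepthYN statistic
last_found = max(row[3] for row in ynstat)  # time when the last point was found
print(avgdepth_yn, last_found)
```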

Detailed results for each instance

Detailed results for each instance can be generated using instance.Rmd. The report is already generated for some of the instances (see links in the table with input statistics). Some instances that might be of interest:

Numerical instabilities

In order to compute the linear relaxation at each node, the solver Bensolve is used. To avoid parsing many files at each node, a wrapper has been made that calls Bensolve, retrieves some outputs, and passes them directly to our code as matrices.

The retrieved outputs are the extreme rays, the extreme points (in the objective space), their pre-images (in the decision space), and a list of the extreme points and rays that constitute each facet of the lower bound set.

To perform the dominance test in the branch and bound, the normal vector of each facet of the lower bound set is required. At this point, as only a description using extreme points is available, the normal vectors are computed using linear algebra. From the extreme points and extreme rays of a facet \(f\) given by Bensolve, it is always possible to determine \(p\) linearly independent points \(x^1,\dots,x^p\) of this facet. To obtain the normal vector of \(f\), the system \(M \cdot n^f = (0,\dots,0,1)^\top\) is solved, where \(n^f\) is the normal vector of \(f\) and \(M\) is the matrix with rows \(M_i = x^i - x^p\), \(\forall i \in \{1,\dots,p-1\}\), and \(M_p = (1,\dots,1)\); the last equation normalizes the components of \(n^f\) to sum to one.
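
This computation can be sketched as follows (our reconstruction of the idea, with a hypothetical facet of the plane \(y_1 + y_2 + y_3 = 6\) as input):

```python
import numpy as np

def facet_normal(points):
    """Normal of a facet given p affinely independent points on it
    (rows of `points`), normalized so its components sum to 1."""
    p = points.shape[0]
    M = np.empty((p, p))
    M[:-1] = points[:-1] - points[-1]   # rows x^i - x^p, i = 1..p-1
    M[-1] = 1.0                         # normalization row (1, ..., 1)
    rhs = np.zeros(p)
    rhs[-1] = 1.0
    return np.linalg.solve(M, rhs)

# Three points of the facet y1 + y2 + y3 = 6.
pts = np.array([[6.0, 0.0, 0.0], [0.0, 6.0, 0.0], [0.0, 0.0, 6.0]])
n = facet_normal(pts)
print(n)  # each component equals 1/3
```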

However, this system is sometimes close to singular, leading to numerical instabilities in the components of the normal vectors. This implies that, occasionally, a facet may be slightly incorrectly oriented, which may be a problem in the dominance test (e.g. considering a local upper bound as non-dominated while it is dominated, or the other way around).

Two epsilons can be introduced here in order to overcome these numerical instabilities. First, the facets whose matrix \(M\) has a determinant too close to 0 are discarded: if \(|\det(M)| \leq \epsilon\), the facet is deleted. The other possibility is to move down the whole lower bound set by subtracting an epsilon from the right-hand side of the facet equations, i.e. the equation of each facet \(f \in \mathcal{L}(\eta)\) becomes \(n^f y = d - \epsilon\) instead of \(n^f y = d\). Both methods produce a weaker lower bound set, leading to greater computational times, but reducing the chances of missing a non-dominated point due to numerical instabilities.

Very rarely, GLPK, the single-objective LP solver used by Bensolve, raises a warning for numerical instabilities as well. Unfortunately, there is no way for us to avoid these, but fortunately, these cases occur much less often.

The list of instances where, due to numerical instabilities, a different number of non-dominated points was found in some of the configurations is given below:

Currently we have instabilities with instances: Forget20-AP_6_3_1-10_random_1_5, Forget20-AP_6_3_1-10_spheredown_1_1, Forget20-AP_6_3_1-10_spheredown_1_4, Forget20-AP_6_3_1-10_spheredown_1_9, Forget20-AP_6_3_1-10_sphereup_1_4, Forget20-AP_6_3_1-10_sphereup_1_6, Forget20-AP_6_3_1-10_sphereup_1_7, Forget20-AP_6_3_1-10_sphereup_1_9, Forget20-AP_7_3_1-10_2box_1_2, Forget20-AP_7_3_1-10_2box_1_4, Forget20-AP_7_3_1-10_random_1_8, Forget20-AP_7_3_1-10_spheredown_1_1, Forget20-AP_7_3_1-10_spheredown_1_3, Forget20-AP_7_3_1-10_spheredown_1_4, Forget20-AP_7_3_1-10_spheredown_1_5, Forget20-AP_7_3_1-10_spheredown_1_7, Forget20-AP_7_3_1-10_spheredown_1_8, Forget20-AP_7_3_1-10_spheredown_1_9, Forget20-AP_7_3_1-10_sphereup_1_10, Forget20-AP_7_3_1-10_sphereup_1_2, Forget20-AP_7_3_1-10_sphereup_1_4, Forget20-AP_7_3_1-10_sphereup_1_5, Forget20-AP_7_3_1-10_sphereup_1_6, Forget20-AP_7_3_1-10_sphereup_1_7, Forget20-AP_7_3_1-10_sphereup_1_8, Forget20-AP_8_3_1-10_2box_1_2, Forget20-AP_8_3_1-10_2box_1_3, Forget20-AP_8_3_1-10_2box_1_4, Forget20-AP_8_3_1-10_2box_1_5, Forget20-AP_8_3_1-10_2box_1_9, Forget20-AP_8_3_1-10_random_1_6, Forget20-AP_8_3_1-10_random_1_9, Forget20-AP_8_3_1-10_spheredown_1_1, Forget20-AP_8_3_1-10_spheredown_1_10, Forget20-AP_8_3_1-10_spheredown_1_2, Forget20-AP_8_3_1-10_spheredown_1_3, Forget20-AP_8_3_1-10_spheredown_1_4, Forget20-AP_8_3_1-10_spheredown_1_5, Forget20-AP_8_3_1-10_spheredown_1_6, Forget20-AP_8_3_1-10_spheredown_1_7, Forget20-AP_8_3_1-10_spheredown_1_9, Forget20-AP_8_3_1-10_sphereup_1_1, Forget20-AP_8_3_1-10_sphereup_1_10, Forget20-AP_8_3_1-10_sphereup_1_2, Forget20-AP_8_3_1-10_sphereup_1_3, Forget20-AP_8_3_1-10_sphereup_1_4, Forget20-AP_8_3_1-10_sphereup_1_5, Forget20-AP_8_3_1-10_sphereup_1_6, Forget20-AP_8_3_1-10_sphereup_1_7, Forget20-AP_8_3_1-10_sphereup_1_8, Forget20-AP_8_3_1-10_sphereup_1_9, Forget20-AP_8_3_1-1000_spheredown_1_10, Forget20-AP_8_3_1-1000_spheredown_1_2, Forget20-AP_8_3_1-1000_spheredown_1_3, Forget20-AP_8_3_1000-2000_spheredown_1_7, 
Forget20-AP_8_3_1000-2000_spheredown_1_9, Forget20-AP_8_3_1000-2000_sphereup_1_2, Forget20-AP_8_3_1000-2000_sphereup_1_6, Forget20-AP_8_3_1000-2000_sphereup_1_9, Forget20-UFLP_10_3_1-10_11-20_sphereup_1_2, Forget20-UFLP_12_3_1-10_11-20_random_1_4, Forget20-UFLP_12_3_1-10_11-20_spheredown_1_7, Forget20-UFLP_12_3_1-10_11-20_sphereup_1_5, Forget20-UFLP_12_3_1-10_11-20_sphereup_1_8, Forget20-UFLP_14_3_1-10_11-20_2box_1_8, Forget20-UFLP_14_3_1-10_11-20_random_1_1, Forget20-UFLP_14_3_1-10_11-20_random_1_10, Forget20-UFLP_14_3_1-10_11-20_random_1_2, Forget20-UFLP_14_3_1-10_11-20_random_1_6, Forget20-UFLP_14_3_1-10_11-20_spheredown_1_1, Forget20-UFLP_14_3_1-10_11-20_spheredown_1_10, Forget20-UFLP_14_3_1-10_11-20_spheredown_1_4, Forget20-UFLP_14_3_1-10_11-20_spheredown_1_7, Forget20-UFLP_14_3_1-10_11-20_sphereup_1_10, Forget20-UFLP_14_3_1-10_11-20_sphereup_1_3, Forget20-UFLP_14_3_1-10_11-20_sphereup_1_4, Forget20-UFLP_14_3_1-10_11-20_sphereup_1_5, Forget20-UFLP_14_3_1-10_11-20_sphereup_1_6, Forget20-UFLP_14_3_1-10_11-20_sphereup_1_9, Forget20-UFLP_14_3_1-1000_1001-2000_spheredown_1_2, Forget20-UFLP_6_3_1-10_1-1_2box_1_1, Forget20-UFLP_6_3_1-10_1-1_2box_1_5, Forget20-UFLP_6_3_1-10_1-1_2box_1_8, Forget20-UFLP_6_3_1-10_1-1_2box_1_9, Forget20-UFLP_6_3_1-10_1-1_random_1_2, Forget20-UFLP_6_3_1-10_1-1_random_1_4, Forget20-UFLP_6_3_1-10_1-1_random_1_5, Forget20-UFLP_6_3_1-10_1-1_random_1_7, Forget20-UFLP_6_3_1-10_1-1_random_1_9, Forget20-UFLP_6_3_1-10_1-1_spheredown_1_1, Forget20-UFLP_6_3_1-10_1-1_spheredown_1_10, Forget20-UFLP_6_3_1-10_1-1_spheredown_1_2, Forget20-UFLP_6_3_1-10_1-1_spheredown_1_3, Forget20-UFLP_6_3_1-10_1-1_spheredown_1_4, Forget20-UFLP_6_3_1-10_1-1_spheredown_1_5, Forget20-UFLP_6_3_1-10_1-1_spheredown_1_6, Forget20-UFLP_6_3_1-10_1-1_spheredown_1_7, Forget20-UFLP_6_3_1-10_1-1_spheredown_1_8, Forget20-UFLP_6_3_1-10_1-1_spheredown_1_9, Forget20-UFLP_6_3_1-10_1-1_sphereup_1_1, Forget20-UFLP_6_3_1-10_1-1_sphereup_1_10, Forget20-UFLP_6_3_1-10_1-1_sphereup_1_2, 
Forget20-UFLP_6_3_1-10_1-1_sphereup_1_3, Forget20-UFLP_6_3_1-10_1-1_sphereup_1_4, Forget20-UFLP_6_3_1-10_1-1_sphereup_1_5, Forget20-UFLP_6_3_1-10_1-1_sphereup_1_6, Forget20-UFLP_6_3_1-10_1-1_sphereup_1_7, Forget20-UFLP_6_3_1-10_1-1_sphereup_1_8, Forget20-UFLP_6_3_1-10_1-1_sphereup_1_9, Forget20-UFLP_6_3_1-1000_1-100_spheredown_1_10, Forget20-UFLP_6_3_1-1000_1-100_spheredown_1_3, Forget20-UFLP_6_3_1-1000_1-100_spheredown_1_8, Forget20-UFLP_6_3_1-1000_1-100_spheredown_1_9, Forget20-UFLP_6_3_1-1000_1-100_sphereup_1_8, Forget20-UFLP_7_3_1-10_1-1_2box_1_10, Forget20-UFLP_7_3_1-10_1-1_2box_1_2, Forget20-UFLP_7_3_1-10_1-1_2box_1_3, Forget20-UFLP_7_3_1-10_1-1_2box_1_4, Forget20-UFLP_7_3_1-10_1-1_2box_1_7, Forget20-UFLP_7_3_1-10_1-1_2box_1_8, Forget20-UFLP_7_3_1-10_1-1_random_1_1, Forget20-UFLP_7_3_1-10_1-1_random_1_2, Forget20-UFLP_7_3_1-10_1-1_random_1_3, Forget20-UFLP_7_3_1-10_1-1_random_1_4, Forget20-UFLP_7_3_1-10_1-1_random_1_6, Forget20-UFLP_7_3_1-10_1-1_random_1_7, Forget20-UFLP_7_3_1-10_1-1_random_1_8, Forget20-UFLP_7_3_1-10_1-1_random_1_9, Forget20-UFLP_7_3_1-10_1-1_spheredown_1_1, Forget20-UFLP_7_3_1-10_1-1_spheredown_1_10, Forget20-UFLP_7_3_1-10_1-1_spheredown_1_2, Forget20-UFLP_7_3_1-10_1-1_spheredown_1_3, Forget20-UFLP_7_3_1-10_1-1_spheredown_1_4, Forget20-UFLP_7_3_1-10_1-1_spheredown_1_5, Forget20-UFLP_7_3_1-10_1-1_spheredown_1_6, Forget20-UFLP_7_3_1-10_1-1_spheredown_1_7, Forget20-UFLP_7_3_1-10_1-1_spheredown_1_8, Forget20-UFLP_7_3_1-10_1-1_spheredown_1_9, Forget20-UFLP_7_3_1-10_1-1_sphereup_1_10, Forget20-UFLP_7_3_1-10_1-1_sphereup_1_2, Forget20-UFLP_7_3_1-10_1-1_sphereup_1_3, Forget20-UFLP_7_3_1-10_1-1_sphereup_1_4, Forget20-UFLP_7_3_1-10_1-1_sphereup_1_5, Forget20-UFLP_7_3_1-10_1-1_sphereup_1_6, Forget20-UFLP_7_3_1-10_1-1_sphereup_1_7, Forget20-UFLP_7_3_1-10_1-1_sphereup_1_8, Forget20-UFLP_7_3_1-10_1-1_sphereup_1_9, Forget20-UFLP_7_3_1-1000_1-100_spheredown_1_1, Forget20-UFLP_7_3_1000-2000_100-200_spheredown_1_2, 
Forget20-UFLP_7_3_1000-2000_100-200_spheredown_1_6.

Research questions

We first consider

  1. How do input statistics affect the number of nondominated solutions?
    1. Does the cost range have an effect?
    2. Does the generation method of the objective coefficients (coef) have an effect?
    3. Does the proportion of objective coefficient vectors that are non-dominated (ratioNDcoef) have an effect?
    4. Can we exclude some values of the input statistics for further study (e.g. some ranges)?
  2. Which instances are hard to solve?
    1. What is the CPU time for each instance?
    2. Which sphere generation instances are hardest to solve?
    3. What is the std.dev. within each instance group?
    4. In which subprocedures is the CPU time used?
  3. Which algorithm configuration is best?
    1. Is there a clear winner?
    2. Is the best node selection strategy affected by other algorithm configurations?
      • Are some problem classes solved best with one node selection strategy compared to others?
      • Are ‘easy’ problems solved best with one node selection strategy compared to others?
    3. Do different OB strategies affect the node selection strategy?
  4. How are nodes pruned?

Analysis: Research Question 1 (R1) - How do input statistics affect the number of nondominated solutions?

In this section, we examine the relations between the number of nondominated points \(|\mathcal{Y}_N|\) (column YN) and various input statistics.

R1a: Does the cost range have an effect?

Based on the plots, we may answer the research question by concluding that the range has an effect. A small coefficient range gives a smaller range of possible objective values and hence fewer possible nondominated points. Note that a smaller range may result in a larger number of alternative solutions corresponding to a nondominated point. The speed of the algorithm may be affected by alternative optimal solutions.

For KP and AP, scaling the costs does not affect the number of nondominated points. For UFLP there is an effect, due to the way costs are generated, so here scaling does not give the same results (we have to choose a setting here). [Missing comments for IP]

We may conclude that further study should only consider the range [1,1000] and the two cases for the UFLP with this range.

Note that the number of nondominated points increases with the number of variables, since the total possible range of the objectives increases with n (there is room for more points).

R1b: Does the generation method of the objective coefficients (coef) have an effect?

Based on the plots, we may answer the research question by concluding that the objective coefficient generation method has a strong effect on the number of solutions. Random generation on average produces the smallest number, then 2box. The two sphere methods seem to give approximately the same results.

Note that, in general, the speed of the algorithm is highly correlated with the number of non-dominated points. That is, instances with a high number of nondominated points are harder to solve. Based on this, using one of the sphere methods for further study seems a good choice. We may also consider comparing it against the easiest method.

R1c: Does the proportion of objective coefficient vectors that are non-dominated (ratioNDcoef) have an effect?

This is highly related to the objective coefficient generation method (and the range):

Note that, for fixed pb and rangeC, the ratio approximately increases in the order random, 2box, spheredown and sphereup (with random and 2box, respectively spheredown and sphereup, often close to each other). However, it is important to keep in mind that the ratio is highly dependent on the problem class (how variables are linked together). The UFLP has much smaller ratios because its two sets of costs relate to different aspects of the problem.

Based on the plot, we may answer the research question by concluding that the ratio may have a strong effect on the number of non-dominated solutions. Given a fixed problem class and objective coefficient range, the ratio may have a strong effect on the difficulty of the instance. Note, however, that if the problem class and range are not known, one cannot predict the number of nondominated points.

R1d: Can we exclude some values of the input statistics for further study (e.g. some ranges)?

From the above analysis we may conclude that instances with a lower number of nondominated points can be generated by:

  1. Using a narrow cost range.
  2. Using random cost generation, which results in a low ratio.

To obtain harder instances, we may skip these and instead generate instances with a higher number of nondominated points by:

  1. Using a wide cost range (e.g. [1,1000]).
  2. Using a sphere generation method.

Scaling the range does not have a strong effect for most problem classes, so fixing a cost range seems reasonable. If we want to compare against easy problems, we may use the first option too.

Analysis: Research Question 2 (R2) - Which instances are hard to solve?

We focus on cpu time (tpstotal) and try to get an overview over difficulty of the instances.

R2a: What is the CPU time for each instance?

Let us have a look:

Finally, we may plot only the winner CPU times for each instance:

Some observations

  • There is no clear winner among the different algorithm configurations.
  • There seems to be a bigger difference for KP.
  • The hardest instances are the ones generated with the sphere methods.

R2b: Which sphere generation instances are hardest to solve?

Let us consider ranges [1,1000] and [1,1000][1,100]:

There is no clear winner. Moreover, the shape of the nondominated set is not affected much by the generation method.

R2c: Is the CPU time the same within each instance group?

By instance group we mean a unique combination of namePrefix, constId, pb, n, coef, rangeC, nodesel, varsel, OB.

In general we may have high variation within instance groups.

R2d: In which subprocedures is the CPU time used?

Analysis: Research Question 3 (R3) - Which algorithm configuration is best?

R3a: Is there a clear winner?

Let us find the winner configuration for each instance given some groupings:

Overall, breadth.mof.cone wins. Note that this result is affected by the number of problems in each problem class and their difficulty. For instance, if we have many easy instances, the winner will be the configuration that solves the easy instances best. Let us have a look at the problem classes:

Note that if we look at different problem classes, different variable selection methods may win. Does the generation method affect the winner?

If we group by coef, we have different winners. This may have something to do with the hardness of the instances (the generation method is highly correlated with the number of nondominated points).

Will different methods win for harder instances (here we use YN as a proxy for hardness)? Let us try to plot the aggregated percentage of winner configurations:

It can be seen that for some problem classes the best configuration changes from easy instances to hard ones.
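
The winner determination itself is straightforward; a sketch on made-up rows in the shape of stat.csv (one row per instance and configuration, with tpstotal as the criterion):

```python
from collections import defaultdict

# (instance, configuration, tpstotal) rows -- made-up data.
rows = [
    ("AP_10_3", "breadth.mof.cone", 12.4),
    ("AP_10_3", "depth.mof.exact", 15.1),
    ("AP_10_3", "depth.mfavg.none", 18.0),
    ("KP_20_3", "breadth.mof.cone", 42.9),
    ("KP_20_3", "depth.mof.exact", 35.2),
]

# Winner configuration per instance = smallest total CPU time.
best = {}
for inst, config, tps in rows:
    if inst not in best or tps < best[inst][1]:
        best[inst] = (config, tps)

# Count how often each configuration wins.
wins = defaultdict(int)
for config, _ in best.values():
    wins[config] += 1
print(dict(wins))
```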

R3b: Is the best node selection strategy affected by other algorithm configurations?

We consider all instances:

There seems to be no effect. Let us have a look at the hard instances (ranges [1,1000] and [1,1000][1,100]) and the winner configurations:

Moreover if we filter further and only look at the sphere generation methods:

Analysis: Research Question 4 (R4) - How are nodes pruned?

Conclusions

[Sum up]

Experiments to include in the article